The applicability of the perturbation based privacy preserving data mining for real-world data
نویسندگان
چکیده
The perturbation method has been extensively studied for privacy preserving data mining. In this method, random noise from a known distribution is added to the privacy sensitive data before the data is sent to the data miner. Subsequently, the data miner reconstructs an approximation to the original data distribution from the perturbed data and uses the reconstructed distribution for data mining purposes. Due to the addition of noise, loss of information versus preservation of privacy is always a trade off in the perturbation based approaches. The question is, to what extent are the users willing to compromise their privacy? This is a choice that changes from individual to individual. Different individuals may have different attitudes towards privacy based on customs and cultures. Unfortunately, current perturbation based privacy preserving data mining techniques do not allow the individuals to choose their desired privacy levels. This is a drawback as privacy is a personal choice. In this paper, we propose an individually adaptable perturbation model, which enables the individuals to choose their own privacy levels. The effectiveness of our new approach is demonstrated by various experiments conducted on both synthetic and real-world data sets. Based on our experiments, we suggest a simple but effective and yet efficient technique to build data mining models from perturbed data. 2007 Elsevier B.V. All rights reserved.
منابع مشابه
An Effective Method for Utility Preserving Social Network Graph Anonymization Based on Mathematical Modeling
In recent years, privacy concerns about social network graph data publishing has increased due to the widespread use of such data for research purposes. This paper addresses the problem of identity disclosure risk of a node assuming that the adversary identifies one of its immediate neighbors in the published data. The related anonymity level of a graph is formulated and a mathematical model is...
متن کاملPrivacy-preserving data visualization using parallel coordinates
The proliferation of data in the past decade has created demand for innovative tools in different areas of exploratory data analysis, like data mining and information visualization. However, the problem with real-world datasets is that many of their attributes can identify individuals, or the data are proprietary and valuable. The field of data mining has developed a variety of ways for dealing...
متن کاملIntroducing an algorithm for use to hide sensitive association rules through perturb technique
Due to the rapid growth of data mining technology, obtaining private data on users through this technology becomes easier. Association Rules Mining is one of the data mining techniques to extract useful patterns in the form of association rules. One of the main problems in applying this technique on databases is the disclosure of sensitive data by endangering security and privacy. Hiding the as...
متن کاملA Privacy Preserving Data Mining Scheme Based on Network User’s Behavior
The privacy preserving data mining has become a research hot issue in the data mining field. The server log of the Web site has preserved the page information visited by users. If the page information is not protected, the user’s privacy data would be leaked. Aiming at the problem, it discusses the privacy preserving problem based on the user’s behavior in the Web data mining, and then introduc...
متن کاملOn Random Additive Perturbation for Privacy Preserving Data Mining
Title of Thesis: On Random Additive Perturbation for Privacy Preserving Data Mining Author: Souptik Datta, Master of Science, 2004 Thesis directed by: Dr. Hillol Kargupta, Associate Professor Department of Computer Science and Electrical Engineering Privacy is becoming an increasingly important issue in many data mining applications. This has triggered the development of many privacy-preserving...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
- Data Knowl. Eng.
دوره 65 شماره
صفحات -
تاریخ انتشار 2008